Optimal Reduction of Rule Length in Linear Context-Free Rewriting Systems
Abstract
Linear Context-Free Rewriting Systems (LCFRS) is an expressive grammar formalism with applications in syntax-based machine translation. The parsing complexity of an LCFRS is exponential in both the rank of a production, defined as the number of nonterminals on its right-hand side, and a measure for the discontinuity of a phrase, called fan-out. In this paper, we present an algorithm that transforms an LCFRS into a strongly equivalent form in which all productions have rank at most 2 and which has minimal fan-out. Our results generalize previous work on Synchronous Context-Free Grammar, and are particularly relevant for machine translation from or to languages that require syntactic analyses with discontinuous constituents.
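To make the two parameters concrete, here is a minimal sketch in Python. It is not taken from the paper: the `Production` class and its field names are hypothetical, and it keeps only the fan-out of each nonterminal. It also assumes the standard deduction-based view of LCFRS parsing, in which applying a production to a string of length n costs O(n^k), with k the sum of the fan-outs of all nonterminals in the production.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Production:
    """Hypothetical, simplified view of an LCFRS production (fan-outs only)."""
    lhs_fanout: int          # fan-out of the left-hand-side nonterminal
    rhs_fanouts: List[int]   # fan-out of each right-hand-side nonterminal

    @property
    def rank(self) -> int:
        # Rank: the number of nonterminals on the right-hand side.
        return len(self.rhs_fanouts)

    @property
    def parsing_exponent(self) -> int:
        # Assumption: one string position is needed per span endpoint,
        # so the exponent is the sum of all fan-outs in the production.
        return self.lhs_fanout + sum(self.rhs_fanouts)


wide = Production(lhs_fanout=2, rhs_fanouts=[1, 2, 1])
binary = Production(lhs_fanout=2, rhs_fanouts=[1, 2])
print(wide.rank, wide.parsing_exponent)      # 3 6
print(binary.rank, binary.parsing_exponent)  # 2 5
```

Under these assumptions, lowering the rank only helps if the fan-out of the newly introduced nonterminals stays small, which is why the abstract asks for rank at most 2 together with minimal fan-out.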
Similar resources
Optimal Rank Reduction for Linear Context-Free Rewriting Systems with Fan-Out Two
Linear Context-Free Rewriting Systems (LCFRSs) are a grammar formalism capable of modeling discontinuous phrases. Many parsing applications use LCFRSs where the fan-out (a measure of the discontinuity of phrases) does not exceed 2. We present an efficient algorithm for optimal reduction of the length of production right-hand side in LCFRSs with fan-out at most 2. This results in asymptotical ru...
Optimal Parsing Strategies for Linear Context-Free Rewriting Systems
Reduction is the operation of transforming a production in a Linear Context-Free Rewriting System (LCFRS) into two simpler productions by factoring out a subset of the nonterminals on the production’s right-hand side. Reduction lowers the rank of a production but may increase its fan-out. We show how to apply reduction in order to minimize the parsing complexity of the resulting grammar, and stu...
Position-and-Length-Dependent Context-Free Grammars - A New Type of Restricted Rewriting
For many decades, the search for language classes that extend the context-free languages enough to include various languages that arise in practice, while still keeping as many of the useful properties that context-free grammars have – most notably cubic parsing time – has been one of the major areas of research in formal language theory. In this thesis we add a new family of classes to this fie...
On the Parameterized Complexity of Linear Context-Free Rewriting Systems
We study the complexity of uniform membership for Linear Context-Free Rewriting Systems, i.e., the problem where we are given a string w and a grammar G and are asked whether w ∈ L(G). In particular, we use parameterized complexity theory to investigate how the complexity depends on various parameters. While we focus primarily on rank and fan-out, derivation length is also considered.
An Optimal-Time Binarization Algorithm for Linear Context-Free Rewriting Systems with Fan-Out Two
Linear context-free rewriting systems (LCFRSs) are grammar formalisms with the capability of modeling discontinuous constituents. Many applications use LCFRSs where the fan-out (a measure of the discontinuity of phrases) is not allowed to be greater than 2. We present an efficient algorithm for transforming LCFRS with fan-out at most 2 into a binary form, whenever this is possible. This results...